17 research outputs found

    A new Automatic Formant Tracking approach based on scalogram maxima detection using complex wavelets

    In this paper we present a new formant tracking algorithm in which formant frequencies are estimated by detecting local maxima in a time-frequency representation, namely the scalogram of a complex wavelet transform. Formant frequency candidates are validated as local maxima of the scalogram, which correspond to wavelet ridges. The algorithm then uses the center of gravity of the candidates as a tracking constraint. We tested the algorithm on synthesized and natural voiced speech signals. The resulting formant trajectories were compared with the manually edited trajectories of our Arabic database, used as reference, and with those given by a Fourier transform method and by the LPC analysis used in Praat. Overall, the comparison showed that the first three formant trajectories obtained with the complex Morlet wavelet agree well with the manually edited formant tracks.
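
The candidate-detection step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the number of wavelet cycles (`n_cycles`), the amplitude normalisation, and the relative threshold are all assumptions chosen for illustration, and the sketch assumes the signal is longer than the longest wavelet.

```python
import numpy as np

def morlet_scalogram(signal, fs, freqs, n_cycles=6.0):
    """Magnitude of a complex Morlet wavelet transform.

    Returns an array of shape (len(freqs), len(signal)).
    Assumes the signal is longer than the longest wavelet.
    """
    scalogram = np.empty((len(freqs), len(signal)))
    for i, f in enumerate(freqs):
        # Temporal width of the Gaussian envelope (hypothetical choice).
        sigma_t = n_cycles / (2.0 * np.pi * f)
        t = np.arange(-4 * sigma_t, 4 * sigma_t, 1.0 / fs)
        wavelet = np.exp(2j * np.pi * f * t) * np.exp(-t**2 / (2 * sigma_t**2))
        # Normalise so that amplitudes are comparable across frequencies.
        wavelet /= np.sum(np.abs(wavelet))
        scalogram[i] = np.abs(np.convolve(signal, wavelet, mode="same"))
    return scalogram

def ridge_candidates(scalogram, freqs, rel_threshold=0.1):
    """For each time frame, return the frequencies of local scalogram maxima."""
    freqs = np.asarray(freqs)
    candidates = []
    for frame in scalogram.T:
        interior = frame[1:-1]
        is_peak = (
            (interior > frame[:-2])
            & (interior >= frame[2:])
            & (interior > rel_threshold * frame.max())  # drop weak maxima
        )
        candidates.append(freqs[1:-1][is_peak])
    return candidates
```

Applied to a signal made of two steady sinusoids, `ridge_candidates` should return, for frames away from the signal edges, candidates close to the two component frequencies. Real speech would additionally require pre-emphasis and voicing decisions, which this sketch leaves out.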

    An Evaluation of Formant Tracking methods on an Arabic Database

    In this paper we present a formant database of Arabic, used to evaluate our new automatic formant tracking algorithm based on Fourier ridges detection. The method introduces a continuity constraint based on the computation of centres of gravity for a set of formant candidates. This connects each frame of speech to its neighbours and thus improves the robustness of tracking. The formant trajectories obtained by the proposed algorithm are compared with those of the hand-edited formant database and with those given by Praat with LPC data.
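
One way a centre-of-gravity continuity constraint could work is sketched below. The abstract does not specify the details, so everything here is an assumption: the pooling of candidates from the immediately neighbouring frames, the amplitude weighting, the `max_jump` window, and the function names are hypothetical.

```python
import numpy as np

def centre_of_gravity(freqs, amps):
    """Amplitude-weighted mean frequency of a set of candidates."""
    freqs = np.asarray(freqs, dtype=float)
    amps = np.asarray(amps, dtype=float)
    return float(np.sum(freqs * amps) / np.sum(amps))

def track_formant(frames, start_freq, max_jump=200.0):
    """Follow one formant through a sequence of frames.

    frames: list of (freqs, amps) pairs, the candidates of each frame.
    start_freq: initial guess for the formant frequency in Hz.
    max_jump: assumed maximum deviation (Hz) from the previous estimate.
    """
    track = []
    reference = float(start_freq)
    for i, (cur_freqs, _) in enumerate(frames):
        # Pool candidates from the current frame and its direct neighbours.
        lo, hi = max(i - 1, 0), min(i + 2, len(frames))
        pooled_f = np.concatenate([np.asarray(f, dtype=float) for f, _ in frames[lo:hi]])
        pooled_a = np.concatenate([np.asarray(a, dtype=float) for _, a in frames[lo:hi]])
        near = np.abs(pooled_f - reference) <= max_jump
        cur_freqs = np.asarray(cur_freqs, dtype=float)
        if near.any() and cur_freqs.size > 0:
            cog = centre_of_gravity(pooled_f[near], pooled_a[near])
            # Keep the current-frame candidate closest to the centre of gravity.
            reference = float(cur_freqs[np.argmin(np.abs(cur_freqs - cog))])
        # If nothing is close enough, the previous estimate is carried over.
        track.append(reference)
    return track
```

Because the centre of gravity is computed over neighbouring frames, an isolated spurious candidate has little influence on the choice made in any single frame, which is the kind of robustness the abstract refers to.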

    Unsupervised Domain Adaptation Using Generative Adversarial Networks for Semantic Segmentation of Aerial Images

    Segmenting aerial images has great potential for surveillance and scene understanding of urban areas. It provides a means of automatically reporting the different events that happen in inhabited areas, which benefits public safety and traffic management applications. Since the wide adoption of convolutional neural network methods, the accuracy of semantic segmentation algorithms can easily surpass 80% if a robust dataset is provided. Despite this success, deploying a pretrained segmentation model to survey a new city that is not included in the training set significantly decreases accuracy. This is due to the domain shift between the source dataset on which the model is trained and the target domain of the new city's images. In this paper, we address this issue and consider the challenge of domain adaptation in semantic segmentation of aerial images. We designed an algorithm that reduces the impact of domain shift using generative adversarial networks (GANs). In the experiments, we tested the proposed methodology on the International Society for Photogrammetry and Remote Sensing (ISPRS) semantic segmentation dataset and found that our method improves overall accuracy from 35% to 52% when passing from the Potsdam domain (considered as source domain) to the Vaihingen domain (considered as target domain). In addition, the method efficiently recovers the classes that are inverted due to sensor variation. In particular, it improves the average segmentation accuracy of these inverted classes from 14% to 61%.

    Evaluation d'une nouvelle méthode de suivi de formants sur un corpus Arabe

    This paper develops a formant tracking technique based on Fourier ridges detection. The method introduces a tracking constraint based on the computation of the centre of gravity for a set of formant frequency candidates, which connects each frame of speech to its neighbours and thus improves the robustness of tracking. The formant trajectories obtained by the proposed algorithm are compared with those of a hand-edited Arabic formant database, created especially for this work, and with those given by Praat with LPC data.

    Arabic Pharyngeals in Visual Speech

    Many perceptual experiments show that human talkers provide more intelligible visual speech than synthetic talkers. This inferiority of synthetic visual speech might be due to a lack of fine modeling of the parts of the face that are important to lipreading. Alternatively, some parts of the face that are not generally considered relevant to visual speech, or not visible in face-to-face communication, might actually provide information that humans are capable of decoding. This information might therefore not be modeled accurately in the synthetic speaker. In this paper, we provide evidence from Arabic that some sounds that are not known to be visible might be recognized correctly from visual information alone. We performed a lipreading recognition experiment on Arabic, in which a set of consonant-vowel stimuli were presented as visual-only speech and participants were asked to report what they recognized. The resulting consonant confusion matrix shows that some of these pharyngeals were, to some extent, well discriminated. Results are discussed based on th

    Aspects of Visual Speech in Arabic

    In this paper, we present a study of visual speech in Arabic. More specifically, we performed a lipreading recognition experiment on Arabic, in which a set of consonant-vowel stimuli were presented as visual-only speech and participants were asked to report what they recognized. The overall lipreading scores were consistent with those of experiments in other languages. The resulting consonant confusion matrix shows that some of the phonemes were well discriminated, whereas for others discrimination depends on the context. Results are discussed based on the category of phonemes and the vowel context.
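
As an illustration of how a consonant confusion matrix and per-consonant recognition scores can be derived from such an experiment, here is a minimal sketch. The trial format (pairs of presented and reported consonants), the labels, and the function names are assumptions for illustration and do not come from the study.

```python
from collections import Counter

def confusion_matrix(trials, labels):
    """Count responses per stimulus.

    trials: iterable of (presented, reported) consonant pairs.
    labels: ordered list of consonants; rows are stimuli, columns responses.
    """
    counts = Counter(trials)
    return [[counts[(stim, resp)] for resp in labels] for stim in labels]

def per_stimulus_accuracy(matrix, labels):
    """Proportion of correct responses for each presented consonant."""
    scores = {}
    for i, (label, row) in enumerate(zip(labels, matrix)):
        total = sum(row)
        scores[label] = row[i] / total if total else float("nan")
    return scores
```

A consonant counts as well discriminated when most of the mass in its row lies on the diagonal; off-diagonal counts show which consonants it is confused with.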

    Evaluation of Automatic Formant Tracking Method Using Fourier Ridges

    This paper develops a formant tracking technique based on Fourier ridges detection. This work aims at improving the performance of formant tracking algorithms. The method introduces a continuity constraint based on the computation of the centre of gravity for a set of formant candidates, which links each frame of speech with its neighbours and thus improves the robustness of tracking. The formant trajectories obtained by the proposed algorithm are compared with those of a hand-edited Arabic formant database, created especially for this work, and with those given by Praat with LPC data.